Add early cluster validation to prevent wasted build time in func deploy #3117

Merged
knative-prow[bot] merged 6 commits into knative:main from RayyanSeliya:cluster-errors-deploy on Nov 11, 2025

Conversation

@RayyanSeliya
Contributor

Changes

  • 🐛 Add early cluster validation to prevent wasted build time in func deploy
  • 🧹 Improve error messages for cluster connection failures with actionable guidance

What Changed

func deploy now validates Kubernetes cluster connectivity before starting the container build process, preventing wasted time when the cluster is inaccessible.

Before:

  • Built container image first (2-5 minutes)
  • Then failed with confusing errors like "invalid run-image" or "context canceled"
  • Different errors depending on whether --build=false was used

After:

  • Validates cluster connection immediately (< 5 seconds)
  • Fails fast with clear, specific error messages
  • Consistent behavior regardless of build flags
  • Provides actionable guidance for each error type

Implementation

Added 2-layer error handling:

  • Layer 1 (pkg/functions/errors.go): Technical errors (ErrInvalidKubeconfig, ErrClusterNotAccessible)
  • Layer 2 (cmd/deploy.go): User-friendly CLI messages with examples

Detects three distinct error scenarios:

  1. Invalid kubeconfig file path
  2. Empty/no cluster configuration
  3. Cluster unreachable (network, auth, down, etc.)
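
A minimal sketch of the two layers described above, assuming sentinel errors that are wrapped with %w and matched with errors.Is. The names ErrInvalidKubeconfig and ErrClusterNotAccessible come from the description; the error texts, the checkCluster stand-in, and the userMessage helper are hypothetical and are not the code merged in this PR. The "empty configuration" scenario is left out because the description does not name its error.

```go
package main

import (
	"errors"
	"fmt"
)

// Layer 1: technical errors. The identifiers are taken from the PR
// description; the message texts are assumptions made for this sketch.
var (
	ErrInvalidKubeconfig    = errors.New("invalid kubeconfig")
	ErrClusterNotAccessible = errors.New("cluster not accessible")
)

// checkCluster is a hypothetical stand-in for the real connectivity check.
// It wraps the sentinel so callers can still match it with errors.Is.
func checkCluster(kubeconfigValid, clusterReachable bool) error {
	if !kubeconfigValid {
		return fmt.Errorf("loading kubeconfig: %w", ErrInvalidKubeconfig)
	}
	if !clusterReachable {
		return fmt.Errorf("connecting to cluster: %w", ErrClusterNotAccessible)
	}
	return nil
}

// Layer 2: a hypothetical CLI helper that turns technical errors into
// user-facing guidance. Unknown errors pass through unchanged.
func userMessage(err error) string {
	switch {
	case err == nil:
		return ""
	case errors.Is(err, ErrInvalidKubeconfig):
		return "The kubeconfig could not be loaded. Check the path in KUBECONFIG."
	case errors.Is(err, ErrClusterNotAccessible):
		return "The cluster could not be reached. Check that it is running and that your credentials are valid."
	default:
		return err.Error()
	}
}

func main() {
	fmt.Println(userMessage(checkCluster(false, true)))
	fmt.Println(userMessage(checkCluster(true, false)))
}
```

Keeping the sentinels free of CLI wording lets other callers of the package match on them without inheriting command-line help text.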

Testing

Tested all combinations:

  • ✅ Invalid KUBECONFIG path (with/without build flags)
  • ✅ Empty kubeconfig (with/without build flags)
  • ✅ Unreachable cluster (with/without build flags)
  • ✅ Cluster stopped after configuration (network error)
  • ✅ Valid cluster with kind (success path)
  • ✅ Unit tests pass (TestDeploy_ConfigPrecedence, etc.)

/kind bug

Fixes #3116

Release Note

`func deploy` now validates cluster connectivity before building, providing immediate feedback with clear error messages instead of wasting time on builds that will fail deployment.

@knative-prow knative-prow Bot added the kind/bug Bugs label Oct 19, 2025
@knative-prow knative-prow Bot added the needs-ok-to-test 🤖 Needs an org member to approve testing label Oct 19, 2025
@knative-prow

knative-prow Bot commented Oct 19, 2025

Hi @RayyanSeliya. Thanks for your PR.

I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@knative-prow knative-prow Bot added the size/L 🤖 PR changes 100-499 lines, ignoring generated files. label Oct 19, 2025
@codecov

codecov Bot commented Oct 19, 2025

Codecov Report

❌ Patch coverage is 12.16216% with 65 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.14%. Comparing base (63026ce) to head (c4652dd).
⚠️ Report is 4 commits behind head on main.

Files with missing lines    Patch %    Lines
cmd/deploy.go               0.00%      38 Missing ⚠️
pkg/knative/deployer.go     24.00%     16 Missing and 3 partials ⚠️
pkg/knative/client.go       27.27%     4 Missing and 4 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3117      +/-   ##
==========================================
+ Coverage   59.91%   63.14%   +3.22%     
==========================================
  Files         150      150              
  Lines       13353    13501     +148     
==========================================
+ Hits         8001     8525     +524     
+ Misses       4416     3971     -445     
- Partials      936     1005      +69     
Flag                 Coverage Δ
e2e-tests            42.29% <12.16%> (+11.05%) ⬆️
integration-tests    57.60% <12.16%> (+1.21%) ⬆️
unit-tests           49.46% <0.00%> (-0.52%) ⬇️

@gauron99
Contributor

/ok-to-test

@knative-prow knative-prow Bot added ok-to-test 🤖 Non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test 🤖 Needs an org member to approve testing labels Oct 19, 2025
@gauron99 gauron99 requested review from gauron99, lkingland and matejvasek and removed request for dsimansk and jrangelramos October 19, 2025 15:14
Three comment threads on cmd/deploy.go (outdated)
@RayyanSeliya RayyanSeliya requested a review from gauron99 October 20, 2025 11:38
@gauron99
Contributor

I'm not much of a fan of this implementation... it's still "test aware", as Matej said, with those literal strings and constants. Also, there has to be a better way to implement this 🤔
Maybe as a method of the client instance (in pkg/functions/client.go)? The instance could have a method such as "CurrentState", "State", "Available", or something like that.
@lkingland @matejvasek

@RayyanSeliya
Contributor Author

I'm not much of a fan of this implementation... it's still "test aware", as Matej said, with those literal strings and constants. Also, there has to be a better way to implement this 🤔 Maybe as a method of the client instance (in pkg/functions/client.go)? The instance could have a method such as "CurrentState", "State", "Available", or something like that. @lkingland @matejvasek

@gauron99 Good point about the client architecture! I checked pkg/functions/client.go and saw it already has methods like Deploy(), Build(), etc.

I'm thinking we could add:

func (c *Client) ClusterAvailable(ctx context.Context) error

Then just call it before building. Tests would naturally skip it since they use NewTestClient() with mocks.
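
For illustration only, a rough sketch of how such a method could sit on the client and be swapped out in tests. Client and ClusterAvailable are the names proposed above; the clusterChecker interface, mockChecker, errDown, and tryDeploy are hypothetical, and the real Client in pkg/functions/client.go is structured differently.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// clusterChecker is a hypothetical seam that lets tests replace the
// real connectivity probe with a mock.
type clusterChecker interface {
	Check(ctx context.Context) error
}

// Client is a toy stand-in for the real client type.
type Client struct {
	cluster clusterChecker
}

// ClusterAvailable reports whether the target cluster can be used.
// A client without a checker (for example a bare test client) skips the check.
func (c *Client) ClusterAvailable(ctx context.Context) error {
	if c.cluster == nil {
		return nil
	}
	return c.cluster.Check(ctx)
}

// Deploy validates the cluster first, so no build starts when it is down.
func (c *Client) Deploy(ctx context.Context) (string, error) {
	if err := c.ClusterAvailable(ctx); err != nil {
		return "", fmt.Errorf("skipping build: %w", err)
	}
	return "built and deployed", nil
}

// mockChecker returns a fixed result, as a test double would.
type mockChecker struct{ err error }

func (m mockChecker) Check(context.Context) error { return m.err }

var errDown = errors.New("cluster unreachable")

// tryDeploy runs Deploy and flattens the outcome into a single string.
func tryDeploy(c *Client) string {
	out, err := c.Deploy(context.Background())
	if err != nil {
		return err.Error()
	}
	return out
}

func main() {
	fmt.Println(tryDeploy(&Client{cluster: mockChecker{}}))
	fmt.Println(tryDeploy(&Client{cluster: mockChecker{err: errDown}}))
}
```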

The challenge is timing - we currently validate before creating the client (line 423 vs line 462 in cmd/deploy.go's runDeploy()). We'd need to either:

  1. Create the client earlier, or
  2. Make it a standalone function like functions.ValidateCluster()

Which direction makes more sense for the existing code? I don't want to refactor in a direction that isn't feasible.

@lkingland @matejvasek

@lkingland
Member

This one is a little tricky...
I understand the desire to validate in the CLI to "fail fast". But too much pre-validation creates a hard dependency between the CLI and the deployer.

I think it might be worth looking into placing this validation in the deployer implementation itself, returning a typed error on failure, and then capturing this error in the CLI and adding "CLI specific" help text as necessary.

Remember that it's ok if the system builds and then fails on deployment, because it should not repeat the build on a subsequent deploy (it detects the build is "fresh").
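
As a rough sketch of that flow, assuming a typed error matched with errors.As: the ClusterError type, the pipeline struct, and cliMessage below are all hypothetical, and the boolean built flag only stands in for however func actually detects a fresh build.

```go
package main

import (
	"errors"
	"fmt"
)

// ClusterError is a hypothetical typed error a deployer could return.
type ClusterError struct {
	Reason string
	Err    error
}

func (e *ClusterError) Error() string {
	return "cluster unavailable (" + e.Reason + "): " + e.Err.Error()
}

func (e *ClusterError) Unwrap() error { return e.Err }

// pipeline is a toy model of build followed by deploy.
type pipeline struct {
	built     bool // stands in for the "build is fresh" detection
	builds    int  // number of builds actually executed
	clusterUp bool
}

// run builds only when no fresh build exists, then deploys.
// Cluster validation lives in the deploy step, not in the CLI.
func (p *pipeline) run() error {
	if !p.built {
		p.builds++
		p.built = true
	}
	if !p.clusterUp {
		return &ClusterError{Reason: "unreachable", Err: errors.New("connection refused")}
	}
	return nil
}

// cliMessage adds CLI-specific help text for the typed error only.
func cliMessage(err error) string {
	var ce *ClusterError
	if errors.As(err, &ce) {
		return "deploy failed: cluster " + ce.Reason + "; fix connectivity and rerun"
	}
	if err != nil {
		return err.Error()
	}
	return "deployed"
}

func main() {
	p := &pipeline{}
	fmt.Println(cliMessage(p.run())) // first attempt: builds, then fails at deploy
	p.clusterUp = true
	fmt.Println(cliMessage(p.run())) // second attempt: reuses the earlier build
	fmt.Println("builds run:", p.builds)
}
```

In this toy model the second attempt does not rebuild, which is the trade-off being pointed out: a late failure costs one build, not one build per retry.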

@RayyanSeliya RayyanSeliya force-pushed the cluster-errors-deploy branch from 02c38ab to 0c72cf2 Compare November 5, 2025 16:09
@RayyanSeliya
Contributor Author

This one is a little tricky... I understand the desire to validate in the CLI to "fail fast". But too much pre-validation creates a hard dependency between the CLI and the deployer.

I think it might be worth looking into placing this validation in the deployer implementation itself, returning a typed error on failure, and then capturing this error in the CLI and adding "CLI specific" help text as necessary.

Remember that it's ok if the system builds and then fails on deployment, because it should not repeat the build on a subsequent deploy (it detects the build is "fresh").

Thanks for the feedback, @lkingland. It makes sense to move the validation into the deployer itself with typed errors, with the CLI just catching those and providing user-friendly messages. You can have a look now; ping me if any more changes are needed!

@knative-prow

knative-prow Bot commented Nov 9, 2025

@RayyanSeliya: GitHub didn't allow me to request PR reviews from the following users: take, when, some, moments, pls, a, look, have.

Note that only knative members and repo collaborators can review this PR, and authors cannot review their own PRs.

Details

In response to this:

/cc @lkingland pls take a look when have some moments


@RayyanSeliya
Contributor Author

/cc @lkingland

Member

@lkingland lkingland left a comment

/lgtm
Thanks for taking the time to work through a few iterations on this one

@knative-prow knative-prow Bot added the lgtm 🤖 PR is ready to be merged. label Nov 11, 2025
@knative-prow knative-prow Bot added the approved 🤖 PR has been approved by an approver from all required OWNERS files. label Nov 11, 2025
@knative-prow-robot knative-prow-robot added the needs-rebase Cannot be merged due to conflicts with HEAD. label Nov 11, 2025
Signed-off-by: RayyanSeliya <rayyanseliya786@gmail.com>
…er instead of too much pre-validation at the cli

Signed-off-by: RayyanSeliya <rayyanseliya786@gmail.com>
@knative-prow knative-prow Bot removed the lgtm 🤖 PR is ready to be merged. label Nov 11, 2025
@knative-prow-robot knative-prow-robot removed the needs-rebase Cannot be merged due to conflicts with HEAD. label Nov 11, 2025
Signed-off-by: RayyanSeliya <rayyanseliya786@gmail.com>
@RayyanSeliya
Contributor Author

Hey @lkingland, I don't know why the tests are failing!

@gauron99
Contributor

/override ?

@knative-prow

knative-prow Bot commented Nov 11, 2025

@gauron99: /override requires failed status contexts, check run or a prowjob name to operate on.
The following unknown contexts/checkruns were given:

  • ?

Only the following failed contexts/checkruns were expected:

  • E2E Test (ubuntu-latest, springboot)
  • EasyCLA
  • On Cluster RT Test (ubuntu-latest, pack)
  • style / suggester / github_actions
  • style / suggester / shell
  • style / suggester / yaml
  • tide
  • unit-tests_func_main

If you are trying to override a checkrun that has a space in it, you must put a double quote on the context.

Details

In response to this:

/override ?


@gauron99
Contributor

/override "On Cluster RT Test (ubuntu-latest, pack)" "E2E Test (ubuntu-latest, springboot)"

@knative-prow

knative-prow Bot commented Nov 11, 2025

@gauron99: Overrode contexts on behalf of gauron99: E2E Test (ubuntu-latest, springboot), On Cluster RT Test (ubuntu-latest, pack)

Details

In response to this:

/override "On Cluster RT Test (ubuntu-latest, pack)" "E2E Test (ubuntu-latest, springboot)"


@gauron99
Contributor

These failures are unrelated. It's currently an issue with our custom pack builder.

@gauron99
Contributor

/lgtm
/approve

@knative-prow knative-prow Bot added the lgtm 🤖 PR is ready to be merged. label Nov 11, 2025
@knative-prow

knative-prow Bot commented Nov 11, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gauron99, lkingland, RayyanSeliya

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@RayyanSeliya
Contributor Author

These failures are unrelated. It's currently an issue with our custom pack builder.

Yeah! 👍 Thanks.

@knative-prow knative-prow Bot merged commit 0e07ee7 into knative:main Nov 11, 2025
46 of 51 checks passed
Labels

  • approved 🤖 PR has been approved by an approver from all required OWNERS files.
  • kind/bug Bugs
  • lgtm 🤖 PR is ready to be merged.
  • ok-to-test 🤖 Non-member PR verified by an org member that is safe to test.
  • size/L 🤖 PR changes 100-499 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

func deploy: Add early cluster validation to prevent wasted build time

5 participants